Hierarchical loop scheduling for clustered NUMA machines
Abstract
Loop scheduling is an important issue in the development of high-performance multiprocessors. Because modern multiprocessors have high and non-uniform memory access (NUMA) costs, communication costs dominate the execution of parallel programs. Previous affinity algorithms perform better than dynamic algorithms on non-clustered NUMA multiprocessors, but they suffer heavy overheads when migrating workload on clustered NUMA machines. In this paper, we propose a new loop scheduling policy, the hierarchical policy, to improve various affinity scheduling algorithms (AFSs) for clustered NUMA machines. We cyclically distribute the iteration chunks to clusters. When imbalance occurs, the migration of iterations is carried out hierarchically. We use the hierarchical policy to improve AFS and modified AFS (MAFS), and we call the results Hierarchical AFS (HAFS) and Hierarchical MAFS (HMAFS), respectively. AFS uses a deterministic assignment policy to assign repeated executions of a loop iteration to the same processor. MAFS modifies the migration policy of AFS and reduces the number of synchronization operations. We confirm our idea by running many applications on a clustered NUMA simulator. Our experimental results show that the hierarchical policy reduces inter-cluster remote memory accesses, reduces lock operations on the queues, and effectively balances the workload. We also show that HMAFS is the best choice among these algorithms in most cases. © 2000 Elsevier Science Inc. All rights reserved.
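The two steps named in the abstract, cyclic distribution of chunks to clusters and hierarchical migration, can be illustrated with a small Python sketch. This is not the paper's implementation: the function names (`cyclic_distribute`, `pick_victim`), the fixed chunk size, the round-robin spread inside a cluster, and the "search your own cluster first, then other clusters" reading of "hierarchically" are all assumptions made for illustration.

```python
from collections import deque


def cyclic_distribute(n_iters, chunk_size, n_clusters, procs_per_cluster):
    """Deal fixed-size iteration chunks to clusters in round-robin order.

    Returns queues[cluster][proc], each a deque of half-open (start, end)
    iteration ranges. Spreading a cluster's chunks round-robin over its
    processors is an assumption of this sketch.
    """
    queues = [[deque() for _ in range(procs_per_cluster)]
              for _ in range(n_clusters)]
    next_proc = [0] * n_clusters
    for idx, start in enumerate(range(0, n_iters, chunk_size)):
        cluster = idx % n_clusters          # cyclic assignment to clusters
        proc = next_proc[cluster]
        queues[cluster][proc].append((start, min(start + chunk_size, n_iters)))
        next_proc[cluster] = (proc + 1) % procs_per_cluster
    return queues


def load(queue):
    """Number of iterations still waiting in one processor's queue."""
    return sum(end - start for start, end in queue)


def pick_victim(queues, cluster, proc):
    """Choose the queue an idle processor should take work from.

    Hypothetical two-level rule: prefer the most loaded peer in the same
    cluster, and cross the cluster boundary only when every local peer is
    empty. Returns (cluster, proc), or None if no work is left anywhere.
    """
    local = [(load(q), cluster, p)
             for p, q in enumerate(queues[cluster]) if p != proc]
    best = max(local, default=(0, None, None))
    if best[0] > 0:
        return best[1:]
    remote = [(load(q), c, p)
              for c, procs in enumerate(queues) if c != cluster
              for p, q in enumerate(procs)]
    best = max(remote, default=(0, None, None))
    return best[1:] if best[0] > 0 else None


queues = cyclic_distribute(n_iters=16, chunk_size=2,
                           n_clusters=2, procs_per_cluster=2)
queues[0][0].clear()                 # processor (0, 0) has run out of work
victim = pick_victim(queues, 0, 0)   # a peer inside cluster 0 is chosen
```

In this sketch, an idle processor causes inter-cluster traffic only after its own cluster has run dry, which is one plausible way to obtain the reduction in inter-cluster remote accesses that the abstract reports.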
Related resources
Clustered affinity scheduling on large-scale NUMA multiprocessors
Modern shared-memory multiprocessors have high and non-uniform memory access (NUMA) costs. Communication cost increasingly dominates the execution time of parallel applications. Algorithms based on affinity, such as the affinity scheduling algorithm (AFS), perform better than dynamic algorithms such as guided self-scheduling (GSS) and trapezoid self-scheduling (TSS). However, as the number of proc...
Scheduling Dynamic OpenMP Applications over Multicore Architectures
Approaching the theoretical performance of hierarchical multicore machines requires a very careful distribution of threads and data among the underlying non-uniform architecture in order to minimize cache misses and NUMA penalties. While it is acknowledged that OpenMP can enhance the quality of thread scheduling on such architectures in a portable way, by transmitting precious information about...
Literature Study NUMA-Aware Thread Scheduling On Hierarchical Multiprocessor Machines
As the use of computer for massive computation keeps increasing, the applications always request more and more performances. To hold that pace, computer architects have proposed always more evolved hardware solutions to get computers powerful beyond imagination. But when dealing with high performance computing, users generally have quite large imagination, and the effect of such a challenge res...
A Hierarchical Production Planning and Finite Scheduling Framework for Part Families in Flexible Job-shop (with a case study)
Tendency to optimization in last decades has resulted in creating multi-product manufacturing systems. Production planning in such systems is difficult, because optimal production volume that is calculated must be consistent with limitation of production system. Hence, integration has been proposed to decide about these problems concurrently. Main problem in integration is how we can relate pro...
An Efficient OpenMP Runtime System for Hierarchical Arch
Exploiting the full computational power of always deeper hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-uniform architecture. The emergence of multi-core chips and NUMA machines makes it important to minimize the number of remote memory accesses, to favor cache affinities, and to guarantee fast completion of synchronization...
Journal title:
- Journal of Systems and Software
Volume: 55
Publication date: 2000